The influence of pitch and noise on the discriminability of filterbank features

Authors

  • Malcolm Slaney
  • Michael L. Seltzer
Abstract

Most features used for speech recognition are derived from the output of a filterbank inspired by the auditory system. The two most commonly used filter shapes are the triangular filters used in MFCC (mel-frequency cepstral coefficients) and the gammatone filters that model psychoacoustic critical bands. However, both of these filterbanks have free parameters that must be chosen by the system designer. In this paper, we explore the effect that different parameter settings have on the discriminability of speech sound classes. Specifically, we focus on two primary parameters: the filter shape (triangular or gammatone) and the filter bandwidth. We use variations in the noise level and the pitch to explore the behavior of different filterbanks, and we use the Fisher linear discriminant to gain insight into why some filterbanks perform better than others. We observe three things: 1) there are significant differences even among different implementations of the same filterbank, 2) wider filters help remove the non-informative pitch information, and 3) the Fisher criterion helps us understand why. We validate the Fisher measure with speech recognition experiments on the Aurora-4 speech corpus.
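As a rough illustration of the kind of measure the abstract refers to, the two-class Fisher criterion for a single scalar feature can be sketched as the squared distance between the class means divided by the sum of the class variances. The function name `fisher_ratio` and the sample values below are assumptions made up for this sketch; they are not taken from the paper, which may compute its Fisher measure differently (for example, over multi-dimensional features).

```python
from statistics import fmean, pvariance


def fisher_ratio(class_a, class_b):
    """Two-class Fisher criterion for one scalar feature.

    Returns (mean_a - mean_b)^2 / (var_a + var_b). Larger values mean the
    two classes are easier to separate along this feature.
    Assumes the combined within-class variance is non-zero.
    """
    mean_a, mean_b = fmean(class_a), fmean(class_b)
    within = pvariance(class_a, mean_a) + pvariance(class_b, mean_b)
    return (mean_a - mean_b) ** 2 / within


# Made-up channel outputs for two sound classes (illustrative only).
# Same class means in both cases; the second set has less within-class
# scatter, as might happen if a nuisance factor were smoothed away.
high_scatter = fisher_ratio([1.0, 3.0, 1.0, 3.0], [3.0, 5.0, 3.0, 5.0])
low_scatter = fisher_ratio([1.8, 2.2, 1.8, 2.2], [3.8, 4.2, 3.8, 4.2])

print(high_scatter, low_scatter)
```

With the class means held fixed, reducing the within-class scatter raises the ratio, which is the sense in which removing a non-informative source of variation can improve discriminability.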

Similar resources

Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

The setup of an emotion recognition or emotional speech recognition system depends directly on how emotion changes the speech features. In this research, the influence of anger and happiness on speech was evaluated and the results were compared with neutral speech, using the pitch frequency and the first three formant frequencies. The experimental results showed that there are lo...

Classification of Iranian Traditional Music Dastgahs Using Features Based on Pitch Frequency

Iranian traditional music is composed of seven major Dastgahs: Chahargah, Homayoun, Mahour, Segah, Shour, Nava, and Rast-Panjgah. In this paper, a new algorithm for the classification of the Iranian traditional music Dastgahs based on pitch frequency is proposed. In this algorithm, the features of Lagrange coefficients of pitch logarithm (LCPL), Fuzzy similarity sets type 2 (FSST2), and th...

On the relevance of auditory-based Gabor features for deep learning in robust speech recognition

Previous studies support the idea of merging auditory-based Gabor features with deep learning architectures to achieve robust automatic speech recognition; however, the cause behind the gain of such a combination is still unknown. We believe these representations provide the deep learning decoder with more discriminable cues. Our aim with this paper is to validate this hypothesis by performing ex...

Pitch and Time, Tonality and Meter

We examined how the structural attributes of tonality and meter influence musical pitch-time relations. Listeners heard a musical context followed by probe events that varied in pitch class and temporal position. Tonal and metric hierarchies contributed additively to the goodness of fit of probes, with pitch class exerting a stronger influence than temporal position (Experiment 1), even when li...

Noise Robust Pitch Tracking by Subband Autocorrelation Classification

Pitch tracking algorithms have a long history in various applications such as speech coding and extracting information, as well as other domains such as bioacoustics and music signal processing. While autocorrelation is a useful technique for detecting periodicity, autocorrelation peaks suffer ambiguity, leading to the classic “octave error” in pitch tracking. Moreover, additive noise can affec...

Publication date: 2014